POMDP planning and execution in an augmented space

Authors

  • Marek Grzes
  • Pascal Poupart
Abstract

In planning with partially observable Markov decision processes, pre-compiled policies are often represented as finite state controllers or sets of alpha-vectors, which provide a lower bound on the value of the optimal policy. Some algorithms (e.g., HSVI2, SARSOP, GapMin) also compute an upper bound to guide the search and to offer performance guarantees, but they do not derive a policy from this upper bound for computational reasons. Executing a policy derived from an upper bound requires a one-step lookahead simulation to determine the next best action, and evaluating the upper bound at the reachable beliefs is complicated and costly (i.e., it requires linear programming or a sawtooth approximation). The first aim of this paper is to show principled and computationally cheap ways of executing upper-bound policies, which can even be faster than executing lower-bound policies based on alpha-vectors. The second, complementary contribution is a new method for finding better upper-bound policies that outperforms those obtained by existing algorithms, such as HSVI2, SARSOP, or GapMin, on a suite of benchmarks. Our approach is based on a novel synthesis of augmented and deterministic POMDPs, which facilitates efficient optimization of upper-bound policies.
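The sawtooth approximation mentioned above is a standard way to evaluate a point-based upper bound at an arbitrary belief: the bound is stored as values at the corner beliefs plus a set of interior belief points, and each interior point contributes one linear "tooth" that tightens the corner interpolation. The sketch below is a minimal illustration of this interpolation only (not the authors' implementation; the function name and toy numbers are hypothetical):

```python
import numpy as np

def sawtooth_upper_bound(b, corner_values, points):
    """Sawtooth interpolation of a POMDP value upper bound at belief b.

    b             -- query belief over S states, shape (S,)
    corner_values -- upper-bound values at the S corner beliefs e_s, shape (S,)
    points        -- list of (b_i, v_i): interior beliefs and their bound values
    """
    # Baseline: linear interpolation from the corner beliefs alone.
    v0 = float(b @ corner_values)
    best = v0
    for b_i, v_i in points:
        v0_i = float(b_i @ corner_values)
        # Smallest ratio b(s) / b_i(s) over states in b_i's support.
        support = b_i > 0
        ratio = float(np.min(b[support] / b_i[support]))
        # Each interior point contributes one sawtooth facet; take the min.
        best = min(best, v0 + ratio * (v_i - v0_i))
    return best

# Toy 2-state example: corner values 10 and 0, one interior point whose
# value (3.0) is tighter than the corner-only estimate (5.0) at that belief.
corners = np.array([10.0, 0.0])
pts = [(np.array([0.5, 0.5]), 3.0)]
print(sawtooth_upper_bound(np.array([0.5, 0.5]), corners, pts))    # 3.0
print(sawtooth_upper_bound(np.array([0.25, 0.75]), corners, pts))  # 1.5
```

Evaluating this at every belief reachable in a one-step lookahead is exactly the per-action cost the abstract refers to, which is why cheaper execution schemes for upper-bound policies are of interest.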


Related articles

Flexible POMDP Framework for Human-Robot Cooperation in Escort Tasks

We describe a novel method for ensuring cooperation between human and robot. First, we present a flexible and hierarchical framework based on POMDPs. Second, we introduce a set of cooperative states within the state-space of the POMDP. Third, for ensuring an efficient scalability, the framework partitions the overall task into independent planning modules. Lastly, for a robust execution of the ...


Monitoring plan execution in partially observable stochastic worlds

This thesis presents two novel algorithms for monitoring plan execution in stochastic partially observable environments. The problems can be naturally formulated as partially-observable Markov decision processes (POMDPs). Exact solutions of POMDP problems are difficult to find due to the computational complexity, so many approximate solutions are proposed instead. These POMDP solvers tend to ge...


Covering Number for Efficient Heuristic-based POMDP Planning

The difficulty of POMDP planning depends on the size of the search space involved. Heuristics are often used to reduce the search space size and improve computational efficiency; however, there are few theoretical bounds on their effectiveness. In this paper, we use the covering number to characterize the size of the search space reachable under heuristics and connect the complexity of POMDP pl...


Planning to see: A hierarchical approach to planning visual actions on a robot using POMDPs

Flexible, general-purpose robots need to autonomously tailor their sensing and information processing to the task at hand. We pose this challenge as the task of planning under uncertainty. In our domain, the goal is to plan a sequence of visual operators to apply on regions of interest (ROIs) in images of a scene, so that a human and a robot can jointly manipulate and converse about objects on ...


Point-Based Policy Transformation: Adapting Policy to Changing POMDP Models

Motion planning under uncertainty that can efficiently take into account changes in the environment is critical for robots to operate reliably in our living spaces. Partially Observable Markov Decision Process (POMDP) provides a systematic and general framework for motion planning under uncertainty. Point-based POMDP has advanced POMDP planning tremendously over the past few years, enabling POM...



Publication date: 2014